Method for extracting digests, reformatting, and automatic monitoring of structured online documents based on visual programming of document tree navigation and transformation

ABSTRACT

A method for extracting digests, reformatting, and automatic monitoring of structured online documents based on visual programming of document tree navigation and transformation is provided for structured online documents such as HTML, XML, and SGML documents, or any other document that has internal structure that can be represented by a tree. A digest of an online document is a collection of fragments of this document which are of interest to a user. The system is based on a technique whereby a user selects a fragment of an online document shown in a source window and copies this fragment to the target window that contains the reformatted digest. The system generates a sequence of web site navigation commands, online document tree navigation commands, and “Copy Fragment” commands that cause the assembly of the reformatted digest in the target window. The user can later ask the system to replay the generated commands, thus causing automatic creation of the reformatted digest of the changed version of the online document. Therefore, when content of the original document changes, the change is automatically propagated to the digest document. This allows implementation of a simple automatic monitoring of online documents or their reformatted digests. The digest document is usually much smaller than the original document, and usually it does not contain computationally intensive and bandwidth intensive multimedia elements such as graphics, sounds, applets, and scripts. This considerably lowers the bandwidth and processing power requirements for user agents that display document digests. Therefore digest documents can be displayed by user agents running on wireless and portable computing devices that have bandwidth and computational power limitations.

FIELD OF THE INVENTION

[0001] The present invention relates to a method for extracting digests, reformatting, and automatic monitoring of structured online documents based on visual programming of document tree navigation and transformation. More particularly, the invention relates to a system and method whereby a user selects a fragment of an online document shown in a source window and copies this fragment to the target window, and the system creates a sequence of commands that can reproduce this behavior when applied to the new versions of the source documents downloaded from the information source, such as a web site.

BACKGROUND OF THE INVENTION

[0002] Structured online documents, especially HTML and XML documents available on the World Wide Web (WWW), have become very important in the past few years. Such documents contain data which may be periodically updated, wherein such updating does not substantially change the format of presentation of such data.

[0003] These online documents usually are dynamically generated by the web servers and they present data stored in online databases. This data periodically changes, but since these documents are automatically generated by computers, the presentation document structure remains substantially the same for relatively long periods of time. Additionally, even when the web page is updated manually, the presentation document structure may remain substantially the same for relatively long periods of time.

[0004] Examples of such frequently updated online documents include: stock quotes from brokerage web sites; prices of specific items from online commercial vendor sites and from online auction sites; local weather information from weather web sites; airline ticket information provided by airline or travel sites; shipment tracking information from the mail delivery companies; current news headlines from the news organizations' web sites; latest press releases of a specific company issued on their web site; bank account balances for an individual or corporation from the bank web site.

[0005] While all this data may be of great interest to the user, it is often accompanied by data that is unimportant or even irrelevant to a particular user. This irrelevant data unnecessarily complicates comprehension and interpretation of the relevant data and often leads to the user missing important changes in the relevant data.

[0006] Examples of the data that may be unimportant to the user are:

[0007] 1. Stock quotes for a stock of interest to the user are often accompanied by other data such as number of shares outstanding, opening and closing prices, earnings in the last quarter and so on. While the user may need to check this data once every 2 or 3 months, the user is not likely to want to see this data every time a current stock quote is sought.

[0008] 2. Fluctuating price for an item in an online store that interests the user may be accompanied by advertising for other items that the user has no interest in, or it may be accompanied by product photographs which the user has already seen many times.

[0009] 3. Balances of the user's bank accounts may appear in separate online documents (web pages) and be accompanied by the last 10 transactions. The user, however, wants to monitor only balances of all his or her accounts in the bank so that every balance appears in a small window unaccompanied by any other information.

[0010] In addition to this, if the user wants to monitor important data, he or she will find it necessary to push the browser “Reload” button to obtain the latest data from the remote database. This requires considerable manual effort and can be fatiguing even when monitoring one online document. The manual effort required for monitoring several online documents simultaneously is so great that it makes such monitoring very difficult, if not impossible, to do on a regular basis.

SUMMARY

[0011] Online documents generated by online databases provide valuable data that a user may want to monitor. However, this essential information is often accompanied by large quantities of non-essential and even irrelevant information, or information that rarely changes and does not need to be monitored.

[0012] Therefore, a method is needed that allows a user to automate monitoring of essential data extracted from online documents while ignoring non-essential or irrelevant data.

[0013] In the remainder of this Section we present the state of the art in the technical area of this invention and show how this invention differs from the state of the art.

[0014] HTML, Browsers, and DOM

[0015] HTML and XML structured online documents are displayed using web browsers such as Navigator by Netscape® Communications (http://www.netscape.com) and Internet Explorer by Microsoft® Corporation (http://www.microsoft.com/).

[0016] A web browser is used in the preferred embodiment of the present invention.

[0017] However, none of the browsers known to us can display a document fragment in a separate window with no window treatments so that irrelevant information is not seen by the user and this window takes little space on the user's screen. Also none of the browsers known to us implement automatic refresh.

[0018] The present invention augments the browser behavior and it uses the ability of the more advanced browsers to be controlled by other applications. Also the present invention uses the Document Object Model (DOM) to navigate the content of an online document represented as a tree of nodes.

[0019] Web Site Server-Side Customizations

[0020] Most major web sites allow limited server-side customization of their content these days. Examples are MyYahoo® (http://www.yahoo.com/), MyNetscape® (http://www.netscape.com/), etc. These customizations are nothing more than accounts created for users on these web sites. Users see the customized content when they log in to their accounts on the web site.

[0021] Web site customizations provide a limited choice of what can be customized. For example, the user usually can select a portfolio of stocks to be displayed, but he or she usually cannot select what parameters are presented for a particular stock. Also usually such customizations are limited to very few online data categories. For instance, a user can monitor all U.S. stocks using such customization, but he or she cannot monitor, say, Brazilian stocks even though stock quotes for Brazilian stocks may be available online.

[0022] Furthermore, creating user-customized web site content requires complicated and therefore expensive programming from the web site maintainers, so this option is not practical for smaller web sites because of its price and complexity.

[0023] Finally, server-customized web pages are still shown in a regular web browser window that has a lot of unnecessary window treatments, and the user is still required to push the “Reload” button every time he or she wants to update.

[0024] Using the present invention, the user can arbitrarily customize and monitor any web page content and select any presentation format for the customized content, and no programming is required on either the web server side or the user side.

[0025] Online Data Providers

[0026] Several online services exist that can push certain online data such as stock quotes to the user's wired or wireless device such as a pager or computer.

[0027] These services compare to the present invention in the same way as server-side web site customizations, because they have the same problems: limited choice of content that can be monitored, no way to arbitrarily customize presentation of such content and what parameters are included, and a need for expensive server-side programming.

[0028] XML and XSLT

[0029] Several techniques exist that transform a higher level abstract document presentation to the lower level document presentation used for rendering the document. Most notable effort in this area is XSLT language (http://www.w3c.org) that is used to write programs that transform XML documents (http://www.w3c.org) to HTML documents that are rendered in a web browser.

[0030] These techniques do not cover the present invention because they are used to synthesize lower level document presentation from the higher level document presentation but they do not change the content of the document. The present invention is primarily used to change the content of the document without changing the level of abstraction used in the document presentation.

[0031] Related Patents

[0032] U.S. Pat. No. 5,530,852 to Meske, Jr., teaches how to build web sites that store news articles and serve them to users through the Internet, providing categorization and search services. A typical news article is a structured document that has a title, summary (profile), and body. However, U.S. Pat. No. 5,530,852 teaches processing news articles in the web server space, and not in the client space. Also U.S. Pat. No. 5,530,852 teaches programming of reformatting by a highly skilled computer programmer, while the present invention teaches creation of a reformatting script by a non-programmer user.

[0033] U.S. Pat. No. 5,737,592 to Nguyen et al. teaches how to build server-side programs that receive queries from a web browser, automatically convert them to SQL queries, run these queries on a database, convert records returned by the database to HTML and send this HTML back to the requester. The present invention is different from this patent because it applies on the client side and not on the server side and we are not concerned with generation of SQL queries.

[0034] U.S. Pat. Nos. 5,745,754 to Lagarde et al. and 5,752,246 to Rogers et al. teach how to build server-side programs that use Distributed Integration Solution servers to perform extraction of data requested by a user from databases, and presentation of this data in HTML. These teachings would be of use to a highly-skilled programmer who programs web applications in extracting and reformatting data in a database. But they are different from the present invention, because we teach how a non-programmer user can create reformatting scripts on the client side.

[0035] U.S. Pat. No. 5,774,123 to Matson teaches how to record a sequence of navigation commands performed by a user on the web browser and how to later replay these commands causing the browser to repeat the navigation session. The record-and-replay feature of this patent does not teach extracting digests of online documents, nor does this patent teach extracting document digests using document trees and displaying the digests in a separate window.

[0036] U.S. Pat. No. 5,799,304 to Miller teaches how a user agent can filter, i.e. wholly display or wholly reject, a news article based on criteria provided by the user. That is, it teaches how to make search engines more intelligent by using agent technologies. This patent does not relate to extraction of document digests.

[0037] U.S. Pat. No. 5,890,152 to Rapaport teaches how to build a web search engine that takes into account user characteristics such as IQ, etc., all stored in a personal profile database. This patent does not relate to the present invention, because we are not concerned with user characteristics at all.

[0038] U.S. Pat. Nos. 5,895,476 and 5,903,902 to Orr et al. are concerned with server side generation of online documents from the specialized higher level representations of documents. This is different from the present invention because the present invention applies on the client side and it does not change the transformed document's level of abstraction.

[0039] Accordingly, it is a problem in the art to automatically monitor user-selected fragments of the online documents and to create scripts that perform such monitoring when such scripts are to be created visually by a user without requiring the user to write a program of any kind.

SUMMARY OF THE INVENTION

[0040] From the foregoing, it is seen that it is a problem in the art to provide a device meeting the above requirements. According to the present invention, a device is provided which meets the aforementioned requirements and needs in the prior art.

[0041] Specifically, the device according to the present invention provides a method for extracting digests of structured online documents, and automatic monitoring of the said digests. A digest of an online document is a collection of fragments of this document which are of interest to a user. Creation of the scripts that perform the said digest extraction and monitoring employs visual programming of the online document tree navigation and transformation. The disclosed method can be applied to structured online documents such as HTML, XML, SGML documents, or to any other online document that has internal structure that can be represented by a tree.

[0042] More specifically, the system according to the present invention is based on visual programming whereby a user selects a fragment of an online document shown in the source window and copies this fragment to the target window that contains the reformatted digest. The system according to the present invention generates a sequence of web site navigation commands, online document tree navigation commands, and “Copy Fragment” commands that cause the assembly of the reformatted digest in the target window. The user can later ask the system to replay the sequence of generated commands, thus causing automatic creation of the reformatted digest of the changed version of the online document.

[0043] Therefore, according to the present invention, when content of the original document changes and the script that creates the digest is run, the change is automatically propagated to the digest document. This allows implementation of simple automatic monitoring of digests of the online documents which occurs entirely in the user space, that is, in the application that controls the user's browser.

[0044] The digest document is typically much smaller than the original document, and usually it does not contain computationally intensive and bandwidth intensive multimedia elements such as graphics, sounds, scripts, and controls. This considerably lowers the screen size, bandwidth and processing power requirements for user agents that display document digests. Therefore, document digests can be displayed by user agents that run on wireless and portable computing devices. Such devices have small screens, and their bandwidth and computational power resources are limited.

[0045] The preferred embodiment of the present invention is a computer program that is called WebTransformer™. It runs on Microsoft® Windows® 32-bit operating systems and as of filing date it controls the Microsoft Internet Explorer.

[0046] Vocabulary

[0047] Source Document and Source Window.

[0048] The source window typically contains a regular browser such as Microsoft Internet Explorer. In this window the source online document is shown. The window is used to navigate to the web page of interest and to select a fragment of this page to be monitored.

[0049] Target Document and Target Window.

[0050] The target window is where the digest of the source document is displayed. The digest of the source document that the user monitors is also called the target document. The target window is typically much smaller than the source window and it does not have window treatments such as menu bars and scroll bars, so that it is possible to have many such windows on one screen.

[0051] Command

[0052] A recordable elementary instruction that performs an operation on a document tree.

[0053] Script

[0054] A recorded or otherwise created sequence of commands.

[0055] How It Works

[0056] The user typically performs the following actions in order to use the present invention.

[0057] First, the user browses documents in the source window and when seeing a document of interest selects a fragment of the document that constitutes a digest. Selection is performed by clicking the desired element of the web page. This click is translated by the browser into the address of the node in the document tree that represents the minimal HTML element that covers the clicked area.

[0058] The user can then use the arrow keys of a computer keyboard to extend, contract, or move sideways the selection. Other selection mouse clicks and keyboard keys may be used depending on the web browser.

[0059] When the user finishes selecting the fragment, the user invokes the user interface “Copy” command that copies contents of the selected fragment from the source window to the target window. Please note that the target window does not have to be visible when the source document fragment is selected. The target window may become visible upon creation of the script. Similarly, the source window may be not visible when the script is replayed.

[0060] In addition to that, according to the present invention the WebTransformer creates a script that records the source document location, the sequence of document tree navigation commands that leads from the tree root to the node that corresponds to the selected fragment, and the “Copy Fragment” command.

[0061] The system can record all elements of user navigation including entering User ID and Password or filling out and submitting other online forms that cause the desired navigation.

[0062] Finally, according to the present invention the user can ask the WebTransformer to run the script that has been created. The user can request a one-time execution of the script or automatic periodic execution of the script according to a user-specified time table. Script execution results in a fresh (not from cache) download of the source document, navigating the source document tree to the selected tree node and copying the selected source document fragment to the target window.

[0063] Summary of Benefits

[0064] The present invention brings the following benefits to its user:

[0065] 1. User views and monitors only the fragments of online documents that are of interest to him or her, not the whole documents.

[0066] 2. User does not have to push the “Reload” button; it is done for him or her automatically by the WebTransformer.

[0067] 3. Combination of typically small size of target windows and auto-refresh feature allows the user to monitor many (10-50) online documents simultaneously without applying any manual effort.

[0068] 4. Since the document digest is small and it typically does not contain large pictures or embedded programs (such as JavaScript, Java, ActiveX programs), the document digests download and execute much faster than the original documents.

[0069] 5. Since document digests are small in size, and since they require less bandwidth and less computational power to display than the original documents, the document digests can be successfully displayed on small-screen user agents that have bandwidth and computational power limitations, specifically on user agents that run on wireless devices such as cellular phones, pagers, wireless personal digital assistants (PDA), and so on. These devices' primary limitation is screen size, so they would greatly benefit from the present invention.

[0070] Other objects and advantages of the present invention will be more readily apparent from the following detailed description when read in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0071] FIG. 1 schematically shows two source documents, each shown in a source window, and a target document shown in a target window.

[0072] FIG. 2 shows a concrete example of source document from the financial web site contained in source window and the document digest of this document shown in a target window.

[0073] FIG. 3 shows a concrete example of source document obtained from a shipping company and digest of this document monitored in a target window. It also shows several other target windows that monitor other source web pages with their source windows hidden.

[0074] FIG. 4 shows a partial source document tree for the source document shown in FIG. 2.

[0075] FIG. 5 shows a WebTransformer script that extracts document digest from the source window and shows it in the target window in FIG. 2.

[0076] FIG. 6 shows a block diagram for client-server WebTransformer setup.

[0077] FIG. 7 shows a block diagram of communicating devices for use in a wireless device application according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION

[0078] Windows

[0079] In the preferred embodiment a user typically observes two windows per instance of the WebTransformer script:

[0080] 1. Source Document Window. This window contains the source online document that is displayed using a regular web browser such as Microsoft Internet Explorer. This window is used to navigate to the online document that will be monitored and to select a fragment of the online document to be monitored.

[0081] 2. Target Document Window. This window is where the digest of the source document appears. This window is usually smaller than the source window and it typically has no window treatments such as menu bars, control box, or scroll bars.

[0082] When a WebTransformer script is recorded, the source window and possibly the target window are displayed. When the recorded script is replayed, the user has an option of displaying both source and target windows or only the target window. Typically the user does not display the source window at script replay time.

[0083] If the target document is assembled from several source documents, then several source windows may be displayed. However, each WebTransformer script typically has only one target window associated with it.

[0084] The goal of this design is to keep target windows as small as possible so that several such windows monitoring different documents can be placed on the screen without overlapping each other.

[0085] FIG. 1 schematically shows two source documents in source windows and one target document in the target window. Source document 1 is displayed in the source window 10. Source document 2 is displayed in the source window 20. Target document is displayed in the target window 30.

[0086] FIG. 2 contains the actual screen shot of the working WebTransformer. It shows the source window 10 on the left that contains the source online HTML document from the web site at “http://www.quicken.com/” that contains a detailed stock quote for CyberCash® Inc. Note that the “Last Trade” digits “12¾” (30) are highlighted to show that these digits constitute the document fragment selected by the user.

[0087] The small window 20 on the right is the target window that shows the target online document that contains the same digits “12¾” (40) that constitute the target document fragment that was copied from the source document fragment 30. The target window title contains the name of the WebTransformer script that created the target document and the time when the script was last run.

[0088] FIG. 3 shows the web page (online document) 10, in this case depicting a FedEx® Corp web page that is used to track air shipments. A user selected web page fragment 30 that contains the latest event that happened to the user's shipment. This fragment is copied to the target window 20 where it is shown as the document fragment 40.

[0089] Also shown in FIG. 3 are unrelated WebTransformer target windows 50, 60, and 70 that track other web sites. Specifically, window 50 tracks a stock quote taken from a financial services web site, window 60 tracks a particular lot price from the online auction, and window 70 tracks weather in New Jersey from a weather web site. The source windows that correspond to these target windows are hidden on instruction from the user.

[0090] Source Document Tree and DOM

[0091] We use tree representation of the source online document in creating the transformation script according to the present invention. In the document tree each logical unit of the document such as paragraph, table, heading, emphasis is represented by a node. Node A is a child of node B if and only if the document fragment represented by node A is directly embedded into the document fragment represented by node B.
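The parent-child relation described above can be illustrated with a minimal sketch. It assumes Python's standard xml.dom.minidom module as a stand-in tree model (the preferred embodiment uses the browser's DOM instead), and the sample markup and the helper name is_child are hypothetical, chosen only for illustration.

```python
from xml.dom import minidom

# Hypothetical, well-formed sample document (not taken from the figures).
SAMPLE = (
    "<body><table><tr>"
    "<td>Last Trade</td><td><b>12 3/4</b></td>"
    "</tr></table></body>"
)


def is_child(a, b):
    """Return True if node `a` is directly embedded in node `b`."""
    return a.parentNode is b


doc = minidom.parseString(SAMPLE)
body = doc.documentElement                # root element of the sample
table = body.childNodes[0]                # directly embedded in body
bold = doc.getElementsByTagName("b")[0]   # nested several levels deeper

print(is_child(table, body))  # table is a child of body
print(is_child(bold, body))   # b is a descendant, but not a child, of body
```

In this sketch the B element is a child of the second TD element only; its relation to BODY is expressed by the chain of parent links between them.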

[0092] The most popular implementation of the online document tree model for HTML and XML online documents is Document Object Model (DOM) (see http://www.w3c.org/ for details). Document Object Model is implemented in modern browsers such as Microsoft Internet Explorer ver. 5 or Netscape Navigator ver. 5. The preferred embodiment of this invention uses DOM as a source document tree model. Other embodiments of this invention can use different tree models for representing the source document.

[0093] FIG. 4 shows a partial document tree for the source document 10 from FIG. 2 (the complete tree is too big to show on one page). The root of the tree contains BODY element 10 that represents the body of the document. The B (for bold) node 20 represents HTML element B that contains the user-selected document fragment 30 on FIG. 2. The path consisting of tree nodes 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, and 41 leads from the root of the tree to the tree node 20.

[0094] Creating the Script

[0095] A script that performs online document transformation according to this invention (also called WebTransformer Script, or WTS) is created in the following manner.

[0096] A source document is displayed in the first window 10 of FIG. 1. The first window 10 is herein referred to as a source window 10. Transformed (target) document is displayed in a second window 30. The second window 30 is herein referred to as a target window 30. Note that the target window may be kept invisible until the script is created.

[0097] A user can select a source document fragment by clicking the desired fragment using a computer pointing device such as a mouse. The selected source document fragment is highlighted. Then, using keys of a computer keyboard, the user can expand or contract the selected fragment. In FIG. 1, a fragment 15 is shown as being selected.

[0098] Once the fragment 15 is selected, the user can copy the fragment 15 to the target window 30 by selecting the “Copy” user interface command from the graphical menu of commands, and a copied fragment then appears in the target window 30 as a target fragment 31. The user can then proceed, for example, to another online document 20, select a fragment 25 therein and copy it to another target location 32 in the target window 30.

[0099] The script that downloads the source document and transforms its fragment into the fragment in the target document is created according to the following rules:

[0100] 1. Add to the script the “Go To URL” command that causes the browser in the source window to navigate to the source document. The location of the source document includes URL address. The location information can also include additional data that needs to be passed to the web server to cause displaying of the page selected by user, such as post data and headers.

[0101] The command 10 from the sample WebTransformer script shown at FIG. 5 causes the browser to navigate to address “http://www.quicken.com/investments/quotes/?symbol=cych”. This sample script transforms the source document 10 at FIG. 2 to the target document 40.

[0102] 2. Add to the script a sequence of “Go To Child” commands that take us from the downloaded document tree root to the document tree node that represents document fragment selected by user for monitoring.

[0103] Creation of the command sequence starts with finding a tree node that corresponds to the document fragment selected by the user. WebTransformer asks DOM implementation to compute the minimal HTML element that covers the selection made by the user in document. Single mouse click is treated as a selection of zero width.

[0104] Then we use parent links to walk up from the selected node to the root node. While walking, we record the indices of nodes in their parents, so that the recorded path can be walked again from the root, when the document is reloaded.

[0105] For instance, the commands 20, 21, 22, 23, . . . , 30, 31 on FIG. 5 walk the tree node path from the root node 10 on FIG. 4 to the user-selected node 20 and on the way they pass tree nodes 31, 32, 33, . . . , 39, 40, and 41.

[0106] 3. Add to the script the “Copy Fragment” command. Creating the script in the case of multiple source pages requires “Copy Fragment” command to be qualified by the target ID at the target document.

[0107] For instance, in FIG. 5 “Copy Fragment” command 40 finishes the script by copying the user-selected source document fragment to the target document.

[0108] The formal algorithm for the script creation is as follows:

[0109] Input: tree node selected-element that is a part of the source document tree

[0110] Output: the script object that is a list of commands.

[0111] 0. Create empty script object.

[0112] 1. Add “Copy Fragment” command to the script object.

[0113] 2. Set variable e that refers to the current tree node to selected-element.

[0114] 3. Do while e is not NULL

[0115] 3a. If e.tag is equal to “BODY” or e has no parent then Exit this loop

[0116] 3b. Create “Go To Child” command object.

[0117] 3c. Node p=e.parent

[0118] 3e. Compute integer ix which is equal to index of node e in the node p.

[0119] Index of the first child is 0, index of the second child is 1, and so on.

[0120] 3f. Store ix in the command.

[0121] 3g. Add command before the first command at the script, then set e to p so that the next iteration continues from the parent.

[0122] 3x. EndDo

[0123] 4. Add “Go To URL” command that navigates browser to the user-selected source page before the first command at the script.
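The script creation algorithm above can be sketched in a few lines of Python. This is an illustrative sketch only, not the WebTransformer implementation: it assumes xml.dom.minidom as the tree model, represents commands as plain tuples, and uses a hypothetical function name (record_script), sample markup, and URL.

```python
from xml.dom import minidom


def record_script(selected, url):
    """Build the command list that leads from the BODY root to `selected`."""
    script = [("CopyFragment",)]                   # steps 0 and 1
    e = selected                                   # step 2
    while e is not None:                           # step 3
        p = e.parentNode
        # Step 3a: stop at BODY or at a node that has no parent.
        if getattr(e, "tagName", "").upper() == "BODY" or p is None:
            break
        ix = list(p.childNodes).index(e)           # step 3e: 0-based index
        script.insert(0, ("GoToChild", ix))        # steps 3b, 3f, 3g
        e = p                                      # continue from the parent
    script.insert(0, ("GoToURL", url))             # step 4
    return script


# Hypothetical sample page: an ad paragraph followed by a quote table.
SAMPLE = (
    "<body><p>Ad</p><table><tr>"
    "<td>Last Trade</td><td><b>12 3/4</b></td>"
    "</tr></table></body>"
)
doc = minidom.parseString(SAMPLE)
selected = doc.getElementsByTagName("b")[0]
script = record_script(selected, "http://example.com/quote")
for command in script:
    print(command)
```

For the sample page the recorded child indices are 1, 0, 1, 0: the table is the second child of BODY, the row is its first child, the selected cell is the second child of the row, and the B element is the first child of that cell.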

[0124] Recorded script can be saved in a computer file and later loaded from that file.
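As a rough illustration of saving and loading, the sketch below stores a script as JSON. The file format, the dictionary keys, the file name, and the function names save_script and load_script are all assumptions made for this example; the text does not specify how WebTransformer stores its scripts.

```python
import json
import os
import tempfile

# Hypothetical in-memory form of a short recorded script.
script = [
    {"cmd": "GoToURL", "url": "http://example.com/quote"},
    {"cmd": "GoToChild", "index": 1},
    {"cmd": "CopyFragment"},
]


def save_script(commands, path):
    """Write the command list to `path` as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(commands, f, indent=2)


def load_script(path):
    """Read a command list previously written by save_script."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


path = os.path.join(tempfile.mkdtemp(), "quote.wts.json")
save_script(script, path)
restored = load_script(path)
print(restored == script)
```

A text format of this kind would also suit the transfer scenarios mentioned in the text, since the file can be e-mailed or uploaded unchanged.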

[0125] Running the Script

[0126] The user can instruct WebTransformer according to the present invention to run the created script or alternatively to run a script loaded from file. The WebTransformer according to the present invention then executes the sequence of commands contained in the script, thus causing the source document(s) to be downloaded from the Internet, and fragment(s) of these documents to be selected and copied to the target window. All this happens automatically according to the recorded script.
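Replaying a script can be sketched as a small interpreter over the recorded commands. The sketch below is an assumption-laden illustration: commands are tuples, xml.dom.minidom stands in for the browser's DOM, the target window is modeled as a list of copied fragments, and the download is injected as a hypothetical fetch function so that the example runs without network access.

```python
from xml.dom import minidom


def run_script(script, fetch):
    """Execute commands in order; `fetch(url)` returns the document source."""
    node = None
    copied = []  # stands in for the content of the target window
    for cmd in script:
        if cmd[0] == "GoToURL":
            # Download the document and start again from its root element.
            node = minidom.parseString(fetch(cmd[1])).documentElement
        elif cmd[0] == "GoToChild":
            node = node.childNodes[cmd[1]]
        elif cmd[0] == "CopyFragment":
            copied.append(node.toxml())
    return copied


def fake_fetch(url):
    """Hypothetical downloader returning a changed version of the page."""
    return (
        "<body><p>Another ad</p><table><tr>"
        "<td>Last Trade</td><td><b>13 1/8</b></td>"
        "</tr></table></body>"
    )


script = [
    ("GoToURL", "http://example.com/quote"),
    ("GoToChild", 1), ("GoToChild", 0), ("GoToChild", 1), ("GoToChild", 0),
    ("CopyFragment",),
]
digest = run_script(script, fake_fetch)
print(digest)
```

Because each “Go To URL” command resets the current node to the root of the newly downloaded document, the same loop can handle a script that assembles one target document from several source documents.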

[0127] The user can either run the script once or instruct the WebTransformer according to the present invention to run the script automatically according to a time table set by the user (for instance, every 5 minutes). The script can be run on the same desktop computer where it was created or the script can be transferred to another computer (for example, by downloading, uploading or e-mailing it) and run on another computer. The other computer may be another computer belonging to the user or can be a server computer which can run this script on a request from a client.

[0128] Why the Tree?

[0129] Every time we reload the source document, there is no guarantee that it will be the same as the previously loaded document or that it will even be close to the previously loaded document. Many things can change even in the relatively stable documents generated from online databases. (1) Advertising banners that appear on most web pages change every time the page is loaded, and they may have a complicated internal structure that is different for every ad that is displayed. (2) Certain non-advertising items may change substantially too. For example, in FIG. 2 there is a list of “Recent Headlines”. The number of elements in this list and the composition of this list may change substantially every few hours as new headlines for the company appear and old headlines are removed. Also, the list of available site features (“Chart”, “Intraday Chart”, “News”, “Evaluator” and so on) changes approximately once every month as the site implements new features and removes old features.

[0130] So, to be able to find the user-selected fragment of the changed source online document, we need to rely on a document model such that an algorithm of getting to the user-selected fragment will be the least affected by changes in the other parts of the document. The Document Tree is the document model that was selected for use in the present invention, because it provides a good degree of independence of the transformation script from the document changes.

[0131] Tree nodes and their children that are not on the path from the root to the user-selected node may change, and their change will not affect the path to the user-selected element, so the script that locates this element will still work. For example, in FIG. 4 nodes 51 and 52 are likely to contain the changing content, because they are related to advertising banners that are often put into IFRAMEs. But these nodes are not on the path from the root node 10 to the user-selected node 20, so even if the entire content of these nodes changes, the transformation script built according to the present invention will still be able to find the user-selected element 20 in the new document tree.

[0132] However, if nodes 51 or 52 in FIG. 4 are removed entirely, then the WebTransformer script will not be able to get to the user-selected node 20. Therefore, repeated running of these transformation scripts in order to obtain an updated digest of the updated source online document substantially relies on the assumption that the path from the root node to the user-selected fragment node will not change in the new document.

[0133] This typically is the case for the frequently updated online documents, because these documents are generated automatically by a web server program that uses the same template each time for dynamic online document generation.
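The two cases discussed in this section can be illustrated with a small Python sketch. The dictionary-based tree, the helper names n and follow, and the sample content are assumptions made for the example. It also assumes that the removed banner node is an earlier sibling of a node on the recorded path, so that its removal shifts the child indices.

```python
from typing import Any, Dict, List

Tree = Dict[str, Any]


def n(tag: str, *children: Tree, text: str = "") -> Tree:
    """Build a hypothetical tree node as a plain dictionary."""
    return {"tag": tag, "children": list(children), "text": text}


def follow(root: Tree, path: List[int]) -> Tree:
    """Apply a recorded sequence of child indices, starting at the root."""
    node = root
    for ix in path:
        node = node["children"][ix]
    return node


path = [1, 0]  # recorded on the original page: BODY -> TABLE -> TD

original = n("BODY", n("IFRAME", n("IMG")),
             n("TABLE", n("TD", text="price 100")))

# Case 1: the banner's internal structure changes; the path is unaffected.
banner_changed = n("BODY", n("IFRAME", n("DIV", n("A"), n("IMG"))),
                   n("TABLE", n("TD", text="price 101")))

# Case 2: the banner node is removed entirely; TABLE moves to index 0.
banner_removed = n("BODY", n("TABLE", n("TD", text="price 102")))

print(follow(original, path)["text"])
print(follow(banner_changed, path)["text"])
try:
    follow(banner_removed, path)
except IndexError:
    print("recorded path not found")
```

In this example the broken path is detected because the index falls outside the child list. A shifted index could also land on a different, unintended node without raising any error, which is one more reason the approach depends on a stable page template.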

[0134] Client-Server WebTransformer

[0135] In the present invention, as described above, displaying of the document digest occurs in the same process and on the same computer that runs the WebTransformer script and performs the transformation. Under certain circumstances it becomes necessary to separate the document digest displaying function from the document digest creation function, so that these functions may be executed on different computers. Then the program that displays the document digest is called the WebTransformer client and the program that performs the online document transformation according to the present invention is called the WebTransformer server.

[0136] See FIG. 6 for a schematic drawing of the client-server setup. The WebTransformer client 10 sends a request to get the fresh document digest to the WebTransformer server 20, which in turn sends a request to download the source online document to the web site 30. When the source online document 50 is returned from the web site 30 to the WebTransformer server 20, the server performs the source document transformation and document digest creation according to the script prepared by the user and uploaded to the server, and the resulting document digest 40 is sent back to the requesting client.
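The request flow of FIG. 6 can be sketched without any networking by injecting the download step as a function. Everything named here is hypothetical: the WebTransformerServer class, its methods, the dictionary-based document tree and the fake_web_site stand-in are assumptions for illustration, and the digest is reduced to the text of the selected node.

```python
from typing import Any, Callable, Dict, List

Tree = Dict[str, Any]


class WebTransformerServer:
    """Hypothetical server: stores uploaded scripts and builds digests on request."""

    def __init__(self, fetch: Callable[[str], Tree]):
        self._fetch = fetch                      # downloads and parses a source page
        self._scripts: Dict[str, Dict[str, Any]] = {}

    def upload_script(self, name: str, url: str, path: List[int]) -> None:
        """Store a script prepared by the user (a URL plus child indices)."""
        self._scripts[name] = {"url": url, "path": list(path)}

    def list_scripts(self) -> List[str]:
        return sorted(self._scripts)

    def handle_request(self, name: str) -> str:
        """Serve one client request for a fresh digest."""
        script = self._scripts[name]
        node = self._fetch(script["url"])        # server asks the web site
        for ix in script["path"]:                # document tree navigation
            node = node["children"][ix]
        return node["text"]                      # digest sent back to the client


def fake_web_site(url: str) -> Tree:
    """Stand-in for the web site: a banner followed by the data of interest."""
    return {"text": "", "children": [
        {"text": "banner", "children": []},
        {"text": "ACME 101.5", "children": []},
    ]}


server = WebTransformerServer(fake_web_site)
server.upload_script("quote", "http://example.com/quote", [1])
print(server.list_scripts())
print(server.handle_request("quote"))
```

The client in this arrangement only needs to name a script and display the returned digest, so it never downloads or parses the full source document.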

[0137] The client-server WebTransformer can be used in the following situations:

[0138] 1. The WebTransformer client is located on a small-screen handheld or wireless device. The wireless provider or the individuals themselves set up a WebTransformer server and put their WebTransformer script on it. The wireless device client connects to this server to get the document digests. This setup is described in more detail below.

[0139] 2. A company sets up a firewall that does not give any access to the outside Internet to company employees but uses Internet web sites to feed only the approved information to the employees. The company sets up WebTransformer server 20 and puts on it a number of WebTransformer scripts that extract and reformat the approved data from the Internet. The access to the outside Internet is closed to employees, but they can use their WebTransformer clients 30 to view the approved document digests from the WebTransformer server 20.

[0140] 3. A company sets up a WebTransformer server that monitors a particular web page or assortment of web pages that are of interest to the company. The document digests extracted by WebTransformer scripts are read by a robotic client that converts them to text and stores them into a database. This is a good way to arrange important data extraction through the web site.

[0141] Handheld and Wireless Devices

[0142] The document digest produced by a WebTransformer script is usually smaller than the original document, and it usually does not contain computationally intensive and bandwidth intensive multimedia elements such as graphics, sounds, scripts, and applets. This lowers screen size, bandwidth and processing power requirements for user agents that receive and display such document digests.

[0143] Since handheld and wireless devices such as screen cell phones, pagers and personal digital assistants (PDAs) all have small screens and most of them also have limitations in available bandwidth and processing power, it is more appropriate to use such devices for online document monitoring using the present invention than to use such devices for web browsing. A complete web browser for such devices, even if developed, would not be very practical, because most web pages are designed for large desktop screens and not for small screens used in handheld and wireless devices. Therefore, viewing a web page designed for the big screen will not be convenient on the small screen of a handheld device, and developing a small-screen version of every web page out there is impractical.

[0144] The present invention provides a way of monitoring small fragments of larger web pages on a handheld or wireless device with a small screen. A preferred scheme of using the present invention to monitor the fragments of the web pages on a small-screen device with limitations in available bandwidth and computational power is presented in FIG. 7.

[0145] In this scheme, a user creates scripts according to the present invention on his or her desktop computer 60 in FIG. 7. The created scripts are uploaded to the central server computer 20 of the wireless provider over the user desktop to wireless provider connection 70, which typically is a dialup connection.

[0146] The handheld device 10 can communicate with the central wireless computer 20 over a relatively slow wireless or similar link 40. The handheld device can download a list of available WebTransformer scripts that the user uploaded to the central computer. On instruction from the user, the handheld device 10 can ask the central computer 20 to run the transformation script and to send the digest documents produced by the script to the handheld device, where they are shown as the document digests 11 and 12.

[0147] This way, communications that require potentially high bandwidth, such as downloading the source online document from the web site 30 to the central computer 20, will occur over the fast communication link 50 that typically exists between server computers. All operations related to the source page downloading and transformation that potentially require higher computing power will occur on the central computer 20. The handheld device 10 will only need to download a small digest document over the slow link 40, and it will show the smaller digest document 11 or 12 on its small screen.

[0148] Also, the user can ask the central server computer 20 to send the user a target document only when it changes. This way, even fewer bytes have to be sent between the central computer and the wireless device.
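One way the central computer might decide whether a target document has changed is to remember a fingerprint of the last digest it sent for each script. The sketch below is an assumption-laden illustration: the ChangeFilter class and its to_send method are hypothetical names, and the use of a SHA-256 hash is a choice made for the example rather than something the description prescribes.

```python
import hashlib
from typing import Dict, Optional


class ChangeFilter:
    """Hypothetical helper that suppresses digests identical to the last one sent."""

    def __init__(self) -> None:
        self._last: Dict[str, str] = {}   # script name -> fingerprint of last digest

    def to_send(self, script_name: str, digest: str) -> Optional[str]:
        """Return the digest if it differs from the previous one, else None."""
        fingerprint = hashlib.sha256(digest.encode("utf-8")).hexdigest()
        if self._last.get(script_name) == fingerprint:
            return None                   # unchanged: nothing crosses the slow link
        self._last[script_name] = fingerprint
        return digest


changes = ChangeFilter()
first = changes.to_send("quote", "ACME 101.5")    # first run: always sent
second = changes.to_send("quote", "ACME 101.5")   # same content: suppressed
third = changes.to_send("quote", "ACME 102.0")    # new content: sent
print(first, second, third)
```

Storing only a fingerprint keeps the server's per-script state small; storing the full previous digest would work equally well for short documents.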

[0149] Additional Features

[0150] The following features, while not strictly necessary in understanding or applying the ideas of the present invention, are additional aspects of the present invention.

[0151] 1. Several source online document fragments can be used to create one target document. In this case, according to the present invention, the transformation script may contain several sequences of “Go To URL” commands, “Go To Child” commands, and “Copy Fragment” commands that assemble document fragments from several source documents into one target document.

[0152] Also, in this case the target window contains target placeholders that designate the locations to which a particular source document fragment is copied. Each target placeholder has a distinctive ID, and “Copy Fragment” commands refer to this ID.

[0153] 2. The target window may contain not only target placeholders but also arbitrary “document frame” content created by the user. Such additional content may be used to mark the target placeholders or the whole target document or to additionally format the copied source document fragments.

[0154] This content is created by the user with the help of a target document editor. Any HTML editor can be used as a target document editor. For instance, Microsoft FrontPage can be used as a target document editor.

[0155] 3. A WebTransformer script created according to the present invention can be used as a means of addressing a fragment of an online document. A WebTransformer script according to the present invention can be displayed on a web site or sent by e-mail. When the user clicks the WebTransformer script displayed on a web site or as an e-mail attachment, the WebTransformer is automatically invoked and it displays the online document fragment designated in the script. Monitoring of the displayed fragment starts automatically after the initial display of the fragment.

[0156] 4. According to the present invention, a source document fragment to be monitored by the user can be addressed not only by a sequence of “Go To Child” commands that follow the path from the source document root to the user-selected tree node, but also by assigning a distinct ID to the node and by using a single “Find by ID” command that finds the document tree node uniquely identified by a given ID. This approach requires cooperation from the online document maintainers, because they have to assign distinct IDs to every online document element that is likely to be monitored. They can assign such IDs, for instance, by using the ID attribute of HTML 4.0.

[0157] 5. According to the present invention, the WebTransformer can be instructed by the user to automatically compare the current and the previous version of the target online document, so that if they differ, the user is notified by generating an alert. Such an alert may result in sending an e-mail message to the user-specified recipient or in executing a program or script prepared by the user. Also, if the target document, after being converted to plain text, can be interpreted as a number, then one can generate alerts based on whether such number satisfies a user-specified alert condition.

[0158] The invention being thus described, it will be evident that the same may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the invention, and all such modifications are intended to be included within the scope of the claims.
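Features 1, 4 and 5 above can be sketched together in Python. This is a hedged illustration rather than the actual implementation: the dictionary-based tree, the function names find_by_id, assemble and numeric_alert, the example URLs, and the use of a format string with named fields as the user-created document frame are all assumptions. Here the argument of each “Copy Fragment” command is taken to be the ID of a target placeholder.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Tree = Dict[str, Any]
Command = Tuple[str, Any]  # (command name, argument)


def find_by_id(root: Tree, wanted: str) -> Optional[Tree]:
    """Depth-first search for the node whose "id" equals wanted (feature 4)."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.get("id") == wanted:
            return node
        stack.extend(reversed(node.get("children", [])))
    return None


def assemble(script: List[Command], load: Callable[[str], Tree], frame: str) -> str:
    """Run the script and fill each target placeholder it names (feature 1)."""
    slots: Dict[str, str] = {}
    root: Optional[Tree] = None
    current: Optional[Tree] = None
    for name, arg in script:
        if name == "Go To URL":
            root = current = load(arg)
        elif name == "Go To Child" and current is not None:
            current = current["children"][arg]
        elif name == "Find by ID" and root is not None:
            current = find_by_id(root, arg)
        elif name == "Copy Fragment":
            if current is None:
                raise LookupError(f"no fragment found for placeholder {arg}")
            slots[arg] = current["text"]
    return frame.format(**slots)


def numeric_alert(digest_text: str, condition: Callable[[float], bool]) -> bool:
    """True if the digest parses as a number that meets the condition (feature 5)."""
    try:
        value = float(digest_text.strip())
    except ValueError:
        return False
    return condition(value)


pages: Dict[str, Tree] = {
    "http://example.com/quote": {"children": [
        {"text": "ad", "children": []},
        {"text": "101.5", "children": []}]},
    "http://example.com/news": {"children": [{"children": [
        {"id": "headline", "text": "ACME opens new plant", "children": []}]}]},
}
script = [
    ("Go To URL", "http://example.com/quote"), ("Go To Child", 1),
    ("Copy Fragment", "price"),
    ("Go To URL", "http://example.com/news"), ("Find by ID", "headline"),
    ("Copy Fragment", "news"),
]
result = assemble(script, pages.__getitem__, "Price: {price}\nLatest: {news}")
print(result)
print(numeric_alert("101.5", lambda v: v > 100))
```

The “Find by ID” lookup does not depend on child positions, so it keeps working when sibling nodes are added or removed, provided the document maintainers keep the ID stable.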

The invention is claimed as follows:
 1. A method for extracting digests from structured online documents and monitoring the said digests, comprising the steps of: recording the script that consists of commands that include loading the online document in the source window, navigating the tree of the source online document, and copying a fragment of the online document to the target window; saving the script in a computer-readable medium; and replaying the script using a computer to automatically generate an updated target document from an updated source document.
 2. A method as claimed in claim 1, wherein the structured online document from which information is to be extracted includes any document that has hierarchical internal structure that can be represented by a tree.
 3. A method as claimed in claim 1, wherein the method employs a visual programming technique.
 4. A method as claimed in claim 3, wherein the visual programming technique provides for at least two windows being logically present for each script: a first window as a source window and a second window as a target window.
 5. A method as claimed in claim 4, wherein at the time of script recording the user can select a fragment of a source online document shown in a source window by clicking the said fragment and request creation of a script that finds the selected fragment in the current and subsequent versions of the source document.
 6. A method as claimed in claim 5, wherein at the script creation time a sequence of commands that comprise the script that extracts the selected source document fragment is generated.
 7. A method as claimed in claim 6, wherein the generated sequence of commands includes document tree navigation commands that lead from the root node of the source document tree to the node of the source document tree that represents the fragment selected by the user.
 8. A method as claimed in claim 7, wherein the generated sequence of commands further includes a “Copy Fragment” command that causes transfer of contents of the selected source document fragment from the source window to the target window.
 9. A method as claimed in claim 8, wherein the visual programming technique allows for replaying of the memorized commands at a subsequent time to automatically create a digest of a new version of the specified online document.
 10. A method as claimed in claim 9, wherein the digest is typically smaller than the source online document from which it is made, and the digest is a fragment of a source document that is typically made by the user to omit unnecessary and irrelevant graphics and text elements often present in online documents.
 11. A method as claimed in claim 1, wherein the script can be automatically replayed at predetermined time intervals.
 12. A method as claimed in claim 1, further comprising during the step of recording of commands to form a script, identifying a portion of at least one further structured document to be copied to the target document and identifying a placeholder in the target document to which the said fragment is to be copied.
 13. A method as claimed in claim 1, wherein the copied document fragment is represented by a node in a tree that represents a structured online document.
 14. A method as claimed in claim 1, further comprising during the step of recording of commands to form a script, recording navigation commands that navigate the structured document browser to the source structured document.
 15. A method for extracting digests from structured online documents, and automatic monitoring of the said digests based on visual programming of document tree navigation and transformation, whereby a structured online document is any document that can be stored in a computer and that has a hierarchical structure that can be represented by a tree, comprising the steps of: recording of commands to form a script that identifies a fragment of a structured document to be copied from a source document to a target document; saving the said script in a computer-readable medium; and replaying the script using a computer to automatically generate an updated target document from an updated source document.
 16. A method as claimed in claim 15, wherein a technique is provided whereby for each script at least two windows are logically present: a first window as a source window and a second window as a target window, and wherein the technique allows a user to select a fragment of an online document shown in a source window and to create a script that copies the selected fragment to the target window.
 17. A method as claimed in claim 16, wherein the technique generates a sequence of the source document tree navigation commands that lead from the root node of the source document tree to the node of the source document tree that represents the document fragment selected by user.
 18. A method as claimed in claim 17, wherein the technique further includes “Copy Fragment” commands that cause the assembly of a document digest in the target window.
 19. A method as claimed in claim 18, wherein the technique enables replaying of the memorized commands at a subsequent time to create a digest of a new version of the specified online document.
 20. A method as claimed in claim 19, further comprising during the step of recording of commands to form a script, identifying a portion of at least one further structured source document to be copied to the target document.